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ABSTRACT 

A  hierarchical  procedure  is  described  for  computing  the 
discrete  correlation,  h^(x,y)  o  f(x,y),  between  a  function  f 
and  a  kernel  h^.  For  the  class  of  h  considered  here,  this 
correlation  is  equivalent  to  a  weighted  sum  of  correlations 
of  f(x,y)  with  the  kernel  h^^sr  h^(x/r,y/r)  .  Here  r>l,  so 
h£_^  is  narrower  than  h^  by  1/r.  The  correlations  of  f  with 
h^_^  are  themselves  computed  as  the  weighted  sums  of  correla¬ 
tions  with  still  narrower  kernels.  The  narrowest  kernel,  h^, 
has  unit  width,  and  its  correlation  with  f  is  f  itself. 

Kernels  which  can  be  computed  hierarchically  in  this  way 
closely  approximate  the  Gaussian  probability  distribution. 
Hierarchical  discrete  correlation  (HDC)  is  more  efficient  than 
direct  correlation  or  correlation  using  the  FFT ,  by  two  to 
three  orders  of  magnitude. 

The  HDC  has  immediate  applications  to  computer  image  pro¬ 
cessing.  Samples  of  the  correlations  obtained  at  nearby  image 
positions  can  be  summed  to  obtain  band  pass  Laplacian 
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and  oriented  first  and  second  derivative  operators  ("Mexican 
hat",  edge  and  bar  masks).  The  correlation  of  an  image  with 
many  operators  and  at  many  scales  requires  little  computation 
beyond  a  single  HDC. 
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1.  Introduction 


A  fundamental  and  frequent  task  in  computer  image  analysis 
is  the  computation  of  image  correlations.  Typically  one  image 
is  correlated  with  a  number  of  others  in  order  to  identify 
objects,  or,  in  the  case  of  stereopsis  and  motion  analysis,  to 
detect  object  displacements.  Still  more  often  relatively  small 
operators,  or  masks,  are  correlated  with  larger  images  in  order 
to  extract  local  properties  such  as  bar-  and  edge-like  features 
and  zero  crossings.  Unfortunately  correlations  are  computation¬ 
ally  expensive:  many  elementary  computational  steps  (adds  and 
multiplies)  are  required  for  each  location  in  the  image  at 
which  the  correlation  is  computed.  The  expense  is  compounded 
in  tasks  such  as  texture  or  motion  analysis  where  it  is  appro¬ 
priate  to  process  the  image  with  operators  of  many  sizes,  some 
of  which  cover  hundreds  or  even  thousands  of  image  pixels  [1,2). 

This  paper  describes  a  new  method  for  computing  correlations 
which  is  particularly  well  suited  for  image  processing.  The 
method,  called  hierarchical  discrete  correlation,  or  HDC ,  is 
computationally  efficient,  typically  requiring  two  or  three 
orders  of  magnitude  fewer  computational  steps  than  direct  cor¬ 
relation  or  correlation  computed  in  the  spatial  frequency  domain 
using  the  Fast  Fourier  transform.  In  addition  the  method  simul¬ 
taneously  generates  correlations  for  kernels  (operators)  of 
many  sizes.  These  kernels  closely  approximate  the  Gaussian 


-^probability  distribution,  so  that  the  correlation  is  equivalent 
to  low  pass  filtering.  The  operators  commonly  used  in  image 
processing  can  be  readily  obtained  from  sums  and  differences 

of  Gaussian-like  correlations  at  nearby  image  points.  — _ 

The  principle  underlying  the  HDC  is  that  the  correlation 
of  a  function  with  a  large  kernel  can  be  computed  as  a  weighted 
sum  of  correlations  with  smaller  kernels,  and  these  in  turn  can 
be  computed  as  weighted  sums  of  correlations  with  still  smaller 
kernels.  The  kernels  at  each  iteration  of  the  HDC  computation 
differ  in  size  by  a  factor  r,  which  we  call  the  order  of  the 
hierarchical  correlation,  but  not  in  shape. 

We  begin  by  defining  three  types  of  HDC.  Types  1  and  2 
are  for  integer  values  of  r,  while  type  3  is  for  fractional 
values.  Type  2  differs  from  type  1  only  in  that  an  even  rather 
than  an  odd  number  of  correlations  with  small  kernels  are  summed 
to  obtain  the  correlation  with  the  next  larger  kernel. 

Definitions  and  analyses  are  initially  given  for  HDC  in 
one  dimension,  and  then  it  is  shown  that  these  can  be  directly 
generalized  to  a  second  dimension.  Next  we  show  that  bandpass 
Laplacian  and  first  and  second  derivative  image  operators  can 
be  obtained  from  the  HDC.  Finally  we  compare  the  computational 
cost  of  the  HDC  with  standard  correlation  and  FFT  methods. 


2.  Hierarchical  Discrete  Correlation,  Type  1  (Odd) 

Suppose  f(x)  is  a  single  valued  function  of  x  which  is 

known  only  at  integer  values  of  x.  Suppose  also  that  w(x)  is 

a  discrete  weighting  function  which  is  defined  at  integral  x 

and  which  is  non-zero  only  for  -msxam.  We  then  define  the 

hierarchical  discrete  correlation  of  type  1  (HDCl)  as  a  set 

of  correlation  functions  g^(x)  which  are  obtained  from  f  and  w 

as  follows: 

(Type  1)  gQ(x)  =  f(x) 

m  l-l 

g^(x)  »  I  w(i)  g^,_1  (x-*-ir  x)  for  l> 1  (1) 

i=-m 

Here  l,  m  and  r  are  positive  integers. 

The  function  g^  is  obtained  from  f  through  t  recursions  of 
a  correlation-like  operation  using  the  weighting  function  w(x) . 
Thus  we  say  that  l  is  the  level  of  g^(x)  in  the  HDC  and  w(x)  is 
the  generating  kernel. 

Note  that  g^(x)  is  defined  as  a  sum  of  k=2m+l  values  of 

p_l 

^£-1  x  which  are  separated  by  multiples  of  the  distance  r 
This  sample  distance  grows  geometrically  by  the  factor  r  from 
level  to  level,  so  we  say  r  is  the  order  of  the  HDC;  k  is  called 
the  width  of  the  generating  kernel. 

The  HDCl  is  illustrated  graphically  in  Figure  1. 

Equivalent  Kernel 

The  function  g^  which  is  defined  recursively  in  Eq.  1  may 


also  be  defined  as  a  standard  correlation  of  f  with  an 


equivalent  kernel  h^: 
g^(x)  *  h^(x)  o  f (x) 

«  £  h/(i)f(x+i) 

i— * 


(2) 


The  equivalent  kernel  may  itself  be  defined  recursively,  and 


is  independent  of  f: 

for  x«0 
otherwise 


h0(x) 


{: 


m 


h^(x)  *  E  w(i)  h._.  (x-ir^-1-) 
i*-m  A'" 


(3) 


We  can  show  by  induction  that  the  definition  of  given  in 
Eq.  2  is  equivalent  to  that  given  in  Eq.  1.  First  observe  that 
the  limit  of  the  sum  in  Eq.  2,  M^,  is  just  that  interval  over 
which  h^  is  non-zero.  To  simplify  our  derivations  we  can  extend 
the  range  of  the  sum  without  changing  its  value.  (The  actual 
value  of  will  be  obtained  in  the  next  subsection.) 

For  £= 0  we  have  from  Eqs.  2  and  3 
gQ (x)  »  hQ (x)  o  f  (x) 

M0 

■  E  hn(i)f(x+i) 

i~M0 


■  E  h  (i)f(x+i) 

2,  m~co  W 


f  (x) 


This  result  agrees  with  the  first  part  of  Eq.  1.  Now  suppose 


that  Eq.  2  is  true  for  level  l-l .  We  need  to  show  that  it  is 
also  true  for  level  t. 


We  assume: 

^£-1  (x)  =  h£-l^  ° 

00 

=  Z  h£  (j)f(x+j) 

j=-o° 

Now  substitute  this  into  Eq.  1: 

m  *  l-l 

g Ax)  =  Z  w(i)  [  Z  ho  (j)f(x+irc  x+j)] 

^  i=-m  j  =—oo 

If  we  let  j=ir  +j  we  get: 

111  00  o  i-i 

g»(x)  =  Ew(i)[;  h£_  (j-ir4  i)f(x+j)] 

i=-m  j=-oo 

Thus  we  reorder  the  summations: 

g»(x)  «,Z  [  Z  w(i)hz  . ( j-ir  ^Jffx+j) 

j=-°°  i=-m 

So  from  Eq.  3  we  find 

CO 

g^(x)  =  Z  h^( j) f (x+j) 

j=-09 

=  h^x)  o  f(x) 

This  concludes  the  inductive  proof. 

While  Eqs.  1  and  2  give  equivalent  definitions, 
be  convenient  to  describe  the  correlations  in  terms 
valent  kernels,  but  compute  them  with  the  recursive 
Eq.  1. 


it  will 
of  equi- 
formulation 
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Width  of  the  equivalent  kernel 

Suppose  is  the  largest  value  of  x  for  which  h^(x)  is 


non -zero.  From  Eq.  3  we  see  that 

for  £X) 


and  Mq  =  0 

p-1  p-2 

Thus  =  mr-1,  +mr'c  +...+m 
i 

=  m  £  r 
i=0 


=  m 


(r£-l) 

(r-1) 


(4) 


The  width  k^  of  the  equivalent  kernel 


hi is 


h  -  “r1- 


Constraints 

We  now  adopt  four  constraints  on  the  generating  kernel 

which  are  designed  to  insure  that  the  equivalent  kernels  are 

monophasic,  symmetric,  and  centered  at  x=0. 

m 

Normalization  £  w(i)  =  1  (Cl) 

i=-m 

Symmetry  w(x)  =  w(-x)  for  all  x  (C2) 

Monophasic  w(x^)  a  w(x2)  s  0  for  0*x^*x2  (C3) 

m 

Equal  Contribution  £  w(j+ir)  =  constant  (=l/r)  for 

i=-m 


all  j,  0*j<r  (C4 ) 

The  fourth  constraint  insures  that  every  sample  point  of 
f (x)  contributes  with  equal  weight  to  every  level  of  the  HDC 
when  we  adopt  a  reduced  form  of  correlation.  (The  reduced 
form  will  be  defined  below.) 

The  properties  imposed  by  the  first  three  constraints  on 


the  generating  kernel  are  transferred  through  it  to  the  equiva¬ 


lent  kernels  as  well.  Thus  for  each  L,  h^(x)  is  normalized, 
symmetric  and  monophasic. 

Example  Type  1  HDC 

We  now  illustrate  the  hierarchical  discrete  correlation 
with  an  example  which  is  well  suited  for  image  processing.  We 
let  the  width  of  the  generating  kernel  be  5  (m=2)  and  the  order, 
r,  be  2. 

Let  w(0)=a,  w(l)=b,  and  w(2)=c.  Then  the  constraints  become 
w  ( — 1 )  =  w  ( 1)  =  b 

(symmetry) 

w (-2)  =  w  (2)  =  c 

a+2b+2c  =  1  (normalization) 

aibic^O  (monophasic) 

a+2c  =  2b  (equal  contribution) 

From  these  constraints  we  find 
l/4*a*l/2 
b  =  1/4 
c  =  1/4 -a/ 2 

One  free  parameter  remains  and  we  take  this  to  be  a.  For  each 
value  of  a  within  the  designated  range  we  obtain  a  different 
generating  kernel,  and  from  it  in  turn  a  set  of  equivalent 
kernels.  For  example,  when  a  =  .4,  b  =  .25  and  c  =  .05,  the 
kernels  h^,  h2,  and  h3  are  as  shown  in  Figure  2.  The  scales 
have  been  adjusted  in  these  graphs  to  aid  comparison:  in  each 

l 

case  the  amplitude  of  h^  has  been  increased  by  2  while  the  x 
scale  has  been  reduced  by  the  same  factor.  Thus  the  areas 


under  the  curves  have  been  preserved  (area=l) . 

Notice  that  h^(x)=w(x),  and  that  h3(x)  and  h^  (x)  are 
symmetric  and  monophasic.  In  addition  these  three  equivalent 
kernels  illustrate  a  critical  property  of  HDC : 

Observation  1  (Constant  shape) ;  The  shape  of  the  rescaled 
equivalent  kernel  converges  towards  a  characteristic  terminal 
form  as  the  level,  •£,  is  increased. 

Shape  convergence  is  very  rapid.  If  we  let  h^  designate 
the  characteristic  form  of  the  rescaled  kernels  then,  in  this 
example,  h3  deviates  from  h^  by  less  than  1%,  while  h4  deviates 
less  than  0.1%.  We  use  h^  as  an  estimate  of  h^  here  (Fig.  2d) 
and  in  the  remaining  examples  in  this  paper. 

The  shape  of  the  characteristic  kernel  depends  on  the 
generating  kernel  and  hence  on  the  value  assigned  to  the  free 
parameter  a.  Examples  for  a«0.5,  0.4  and  0.3  are  given  in 
Figure  3. 

When  a=0.5,  then  c=0,  so  the  width  of  the  generating  kernel 
becomes  3.  In  this  case  the  equivalent  kernel  is  triangular, 
as  shown  in  Figure  3a.  For  other  values  of  a,  h^  assumes  a 
"bell"  shape,  which  is  broader  for  smaller  values  of  a.  This 
illustrates  a  second  critical  property  of  the  HDC: 

Observation  2  (Gaussian-like) :Eauivalent  kernels  obtained  when 
the  generating  kernel  has  width  greater  than  3  approximate 
the  Gaussian  probability  distribution. 


The  closeness  of  this  approximation  depends  on  the  value 
of  a.  For  a=0.4  the  approximation  of  ha  to  a  Gaussian  is  par¬ 
ticularly  good,  as  is  shown  graphically  in  Figure  4.  As  a  is 
made  smaller  the  width  of  hM  becomes  greater.  To  characterize 
this  tendency  we  have  found  the  best  fit  (LSE)  Gaussian  to  h^ 
as  a  function  of  a.  The  standard  deviation,  a,  of  this  Gaussian 

is  shown  as  a  function  of  a  in  Figure  5a.  (a  is  shown  normalized 
L 

by  1/r  ,  so  does  not  change  with  level  of  the  HDC.)  The 
squared  error  between  h^  and  the  best  fit  Gaussian  is  shown 
in  Figure  5b.  Note  that  the  error  is  minimum  for  a=0 . 4 . 


3 .  HDC  Type  2  (Even) 


The  width  of  the  generating  kernel  for  type  1  hierarchical 
discrete  correlations  was  always  odd,  k=2m+l.  In  order  to  per¬ 
mit  kernels  of  even  width  we  must  change  the  definition  of  the 
correlation  slightly: 

(Type  2)  gQ (x)  =  f (x) 


m-1 


414  X  1  1 
g^(x)  =  Z  w (i  +  i)g^_1(x+(i  +  ±)rc  )  (5) 

X 


i=-m 

In  this  case  g^(x)  is  defined  for  integer  x  when  (r^-l) / (r-1) 
is  even  but  for  x  =  .. .-1/2, 1/2, 3/2  ...  when  it  is  odd.  The 
generating  kernel  is  also  defined  at  intermediate  values  of  x. 

To  illustrate  type  2  HDC's  we  consider  an  example  in  which 
r=2  and  m=2.  The  structure  of  the  computation  is  shown  graphi¬ 
cally  in  Figure  6.  Let  w(l/2)=a  and  w(3/2)=b.  Then  applying 
the  constraints  we  find: 
w (-1/2)  =  w ( 1/2)  =  a 


w  (-3/2)  =  w ( 3/2)  =  b 


symmetry 


2a+2b  =  1  normalization 

asb^O  monophasic 

(the  fourth  constraint,  equal  contribution,  is  automatically 
satisfied) . 

Thus  we  find: 

1/4  *a*l/2 
b  *  1/2-a 

Characteristic  kernels  are  shown  in  Figure  7  for  a  =  0.5, 


•  —  —■ — 


) 

0.4,  and  0.3.  For  a  =  0.5,  b  =  0,  and  the  generating  kernel 
degenerates  to  width  2,  and  the  characteristic  kernel,  hOT,  is 
square  shaped  (Fig.  7a) .  For  smaller  a  h^  becomes  bell  shaped 
and  Gaussian-like,  although  these  kernels  approximate  the  Gaussian 
function  less  closely  than  the  kernels  obtained  in  the  previous 
example  when  the  generating  kernel  width  was  5.  This  point  is 
made  in  Figure  8  where  we  plot  a  and  the  mean  squared  error  for 
the  best-fit  Gaussian  to  hB  as  a  function  of  a.  The  error  is 
relatively  small  only  over  a  narrow  range  of  a  near  a~.37. 


4 .  Reduced  HDC 


We  now  introduce  a  modification  to  the  hierarchical  discrete 
correlation  in  which  the  number  of  samples  of  g^(x)  is  reduced 
by  a  factor  of  r  from  level  to  level.  The  arrangement  of  nodes 
in  the  graph  which  represents  the  correlation  for  a  type  1  HDC, 
with  r=2,  is  shown  in  Figure  9a.  If  we  compare  this  to  Figure 
1  we  see  that  every  other  node  has  been  retained  at  level  1, 
every  fourth  node  at  level  2,  and  so  on.  However  the  correla¬ 
tions  defined  at  the  remaining  graph  nodes  (sample  points  of 
g^(x))  are  not  changed,  nor  is  the  generating  kernel,  w,  or 
the  equivalent  kernels,  h^.  It  will  be  convenient  to  change 
the  index  of  the  samples  of  g^(x)  as  follows:  Let  gn  ^  be  the 
nth  sample  of  g^(x)  measured  from  g^(0)  in  the  positive  x  direc- 

9 

tion.  Then  9n  ^  ®  g^(nr  ),  and  Eq.  1  becomes: 

(Type  1,  reduced)  g  -  *  f(n)  n  =  . . . -1, 0 , 1, 2 , . . . 

n,  u 


,4 


(6) 


For  Type  2  HDC  let 

1  r^-1  L 

*n,l  =  *  <1  T=r  +  nr  >• 

(See  Figure  9b.)  Then  Eq.  5  becomes: 


(Type  2,  reduced) 


gn,0  *  £<n) 
m-1 


(7) 


These  reduced  forms  of  the  HDC  have  a  distinct  advantage 
over  the  original  forms  in  that  they  are  obtained  with 


substantially  fewer  computation  steps  and  require  less  computer 
storage.  Fortunately  no  information  is  lost  in  adopting  the 
reduced  sample  density  provided  the  sample  interval  (=r^) 
satisfies  the  relationship 

r*  ‘  A 

where  w^  is  the  highest  spatial  frequency  component  of  g^(x) 

[3,  p.  70]. 

Let  G^,  and  F  be  the  Fourier  transforms  of  g^,  h^  and  f 
respectively.  Then 

G£(s)  =  H^(s)F*(s) 

(The  indicates  the  complex  conjugate.)  We  have  observed 

that  ho (x)  closely  approximates  a  Gaussian  distribution.  There¬ 
fore  can  be  reasonably  approximated  by  the  transform  of  the 
Gaussian.  In  particular,  if  the  Gaussian  has  standard  deviation 
a £,  then  its  transform  is  itself  a  Gaussian  and  has  standard 
deviation  a  =  -J\  .  As  a  fairly  conservative  estimate  of  the 


s  2iro, 


high  frequency  limit  of  H^,  and  hence  also  of  G^,  we  say 


=  2cV 


Then  Eq.  8  is  satisfied  when 


aZ  2 

T  *  If 

r 

In  fact  this  is  approximately  the  ratio  obtained  when  the  gene¬ 
rating  kernel  has  width  4  or  5  and  a  is  within  the  Gaussian-like 
region  (see  Fig.  5  and  8).  We  conclude  that  the  correlation 


functions  are  adequately  sampled  in  the  reduced  form  of  the  HOC. 


5 .  Type  3  HDC  (Fractional  Order) 

An  inherent  restriction  in  hierarchical  discrete  correlations 
of  types  1  and  2  is  that  the  order,  r,  must  be  a  positive  integer. 
The  smallest  order  of  interest  is  r=2  since  r=l  results  in  a 


process  which  is  recursive  but  not  hierarchical.  In  this  case, 

the  distance  between  level  t  samples  which  are  summed  to  obtain 

each  level  l+l  sample  does  not  increase  from  level  to  level. 

It  is  possible  to  obtain  fractional  orders,  with  r  between  1  and 

2,  in  the  following  way.  Suppose  r=k^/k2  where  k.^  and  k2  are 

integers  and  k2<k^<-2k2.  We  restrict  our  attention  to  the 

reduced  form  of  the  correlation  so  that  for  every  k^  samples  of 

g^,  there  will  be  k2  samples  of  g^^.  These  samples  are  computed 

t  t +1 

at  regular  intervals  (r  and  r  )  so  that  the  distance  from 
g^+1  ^  to  the  nearest  sample  point  of  g^  changes  with  i.  However 
the  same  nearest  neighbor  distance  is  obtained  when  i  is  increased 
modulo  k2 .  To  accommodate  these  k2  distance  relationships  we 
must  define  k2  different  generating  kernels  wfc(x),  t=0, . . . ,k2«l. 

If  we  position  the  samples  in  the  x  coordinate  so  that  g^ 
g^d-r^+x^  g)  where  x^  g=  (r^-1) /  (r-1)  ,  then  the  distances  be¬ 
tween  g^+1  ^  and  the  nearest  g^  sample  will  be  the  same  as  the 

distance  between  ]c  and  the  nearest  g^  sample,  but  in 

2  th 

the  opposite  direction.  In  this  case  the  t  generating  lfernel 
is  simply  a  mirror  reflection  of  the  (k2~t-l)th  kernel: 

“tu)  *  V-t-i'-*1- 


We  do  not  give  the  general  definition  of  the  type  3  HDC  since 
indexing  of  the  correlation  functions  and  generating  kernels  is 
rather  tedious.  Instead  we  illustrate  the  construction  with 
an  example  in  which  r=3/2.  In  this  case  there  are  two  generat¬ 
ing  kernels,  one  of  which  is  the  mirror  reflection  of  the  other. 
We  let  the  width  of  the  generating  kernels  be  5,  and  say 
wQ  (x^  *  *  a 

w0(x2J  “  wx(-x2)  =  b 

Wq  (JC3)  *  Wj^-Xj)  =  c 

W  *  w1(-x4)  =  d 


where 


W0(X5)  *  wl(_x5)  =  e 


-7/4 


-3/4 


(See  Figure  10.) 

The  constraints  on  w  are  the  same  as  in  previous  HDC* s  except 

that  the  symmetry  constraint  is  replaced  by  one  or  more  balance 

constraints.  Let  x  .  be  the  position  of  the  ith  member  of  the 

kernel  w  (x) .  Then  the  nth  order  balance  constraint  is 
t 


Z  !xt  i 

i,  tor  x.  ,<0  <:'1 


i,  for  x  .>0 


ut(xt,i> 


In  the  present  example  we  have 


a+b+c+d+e  ■  1  normalization 

a+c+d  -  2b+2e  equal  contribution 

cab*d»aie*0  monophasic 

7a+3b  »  c+5d+9e  1st  order  balance 

49a+9b  *  c+25d+81e  2nd  order  balance 
From  these  we  can  determine  the  variables  b,  c,  d,  and  e  in 
terms  of  one  free  variable,  a: 


e 


The  generating  kernels  obtained  subject  to  these  constraints 
are  irregular  in  appearance  (Fig.  11a)  but  the  equivalent  kernels 
are  Gaussian-like  and  nearly  symmetric  (Fig.  lib) .  The  standard 
deviations  and  errors  of  best  fit  Gaussians  to  hM  are  given  in 
Figure  12.  Note  that  type  3  HDC  with  r  =  3/2  and  kernel  width 
5  can  approximate  Gaussians  more  closely  than  the  type  1  and  2 
examples  described  earlier  (compare  Figure  12  with  Figures  5  and 


6.  HDC  in  two  dimensions 


The  three  types  of  hierarchical  discrete  correlations  which 
we  have  defined  thus  far  in  one  dimension  can  be  extended 
directly  to  a  second  dimension.  Assume  the  function  f(x,y)  is 
sampled  for  x,y  =  . . .-1, 0, 1, 2, . . . .  Then  Eq.  6  for  the  reduced 
type  1  HDC  becomes: 

9Vn y,l  ‘  9*(r\'r%> 

W°  *  “W 


9w£ 


JL  j.LW<i'3,9rV 


The  sample  positions  in  two  dimensions  for  types  1  and  2  HDC 
with  r=2  and  type  3  with  4=3/2  are  given  in  Figure  13.  (Other 
sample  arrangements  in  two  dimensions  are  shown  in  the  appendix.) 

Constraints  on  the  generating  kernel  also  extend  in  the 
natural  way: 

m  m 

Normalization  E  £  w(i,j)  =1 
i=-m  j=-m 


Symmetry  w(i,j)  =  w(-i,j)  =  w(i,-j)  =  w(-i,-j) 

(Type  1,2) 

Monophasic  w(i,j)  2  w(k,£)  for  ji|  *  jkj  and  |j|  *  |  £  | 


m  m  A  A  1 

Equal  Contribution  E  E  w (i+ir , 3+ jr) =constant (=l/r^ ) 


i=-m  ]=-m 


for  0*i, 3 <r 


An  important  special  case  is  that  in  which  the  generating  kernel 
is  separable: 

w(i , j )  =  wx(i) *Wy (j) 


Here  the  generating  kernel  w  satisfies  the  two-dimensional  con¬ 
straints  just  when  the  one  dimensional  kernels,  wx  and  w^, 
satisfy  the  one-dimensional  constraints.  Furthermore,  when  w 
is  separable  then  the  equivalent  kernels  are  also  separable: 
h^(x,y)  -  hlx(x)h^(y) 

where  h^x  and  h^  are  the  equivalent  kernels  obtained  in  one 
dimension  with  the  generating  kernels  wx  and  w^.  Thus  the  data 
summarized  in  Figures  5,  8,  and  12  can  be  applied  to  two  dimen¬ 
sions  when  the  kernel  is  separable. 

Suppose  wx  =  Wy  and  the  equivalent  kernel  ^£x^=^£y^  *-s 
approximately  Gaussian.  Then  h^(x,y)  will  approximate  a  two- 
dimensional  Gaussian  function  and  we  may  anticipate  that 
h^(x,y)  will  be  nearly  circularly  symmetric.  This  is  confirmed 
in  the  type  1  HDC  shown  in  Figure  14a,  for  which  r=2,  m=2,  and 
a-. 4.  On  the  other  hand,  if  Wx/Wy  we  obtain  an  equivalent  kernel 
which  is  elongated,  such  as  that  in  Figure  14b.  Again  r=2  and 

m=2,  but  in  this  case  w  and  w  are  determined  by  parameters 

x  y 

a  *.45  and  a  =.25. 
x  y 


HDC  for  Image  Processiiv 


We  now  show  how  the  HDC  can  be  used  to  construct  the  local 
operators  commonly  used  in  image  processing. 

Bandlimited  Laplacian,  L 

Laplacian-like  operators  have  long  been  defined  as  3x3 

weighting  functions  in  which  the  central  weight  is  positive 

(say  8)  while  the  surrounding  weights  are  negative  (say  -1) . 

Often,  however,  it  is  appropriate  to  convolve  operators  of 

various  sizes  with  an  image.  Such  operators  are  said  to  be 

band-limited  since  they  respond  to  details  of  the  image  which 

contain  a  limited  range  of  spatial  frequencies.  The  band -limited 

Laplacian  may  be  formally  defined  as  the  Laplacian  of  a  Gaussian, 
2 

V  G,  but  the  result  is  generally  approximated  as  the  difference 
between  two  Gaussian  functions  which  have  different  standard 
deviations : 


L(x,y)  =  -t— 


1_  e_ 
2ir 


2  2 
2a  j2 


_1_ 

2tt 


-(x2+v2) 

,  2“22 


Typically  the  ratio  t*ie  ran9e  1*5  to  2.5. 

A  band-limited  Laplacian  with  a  ratio  Oj/a^r  may  be  obtained 
from  the  type  1  HDC  of  order  r  simply  by  subtracting 
from  g^(x).  (If  the  reduced  form  of  the  HDC  is  used,  samples 
of  are  not  computed  for  every  sample  of  g^(x).  Missing 

g^+^  are  obtained  by  applying  w(x,y)  to  the  neighborhood  of 
g^(x) . ) 


To  obtain  ratios  c^/a^r  two  HDC  (any  type)  are  computed 
with  different  generating  kernels.  These  are  determined  from 
graphs  such  as  those  in  Figures  5,  8,  and  12  to  obtain  the  desired 
Q.  The  L  operator  is  then  simply  a  difference  between  correspond¬ 
ing  samples  in  the  two  HDC. 

A  third  procedure  yields  ratios  c^/a^r.  First  an  HDC  is 
obtained  (any  type)  to  form  the  central  Gaussian.  Then  the 
surround  Gaussian  is  obtained  by  applying  the  generating  kernel 
in  reverse.  For  type  1  we  have 


j  g  r+  E  E 

VV*  Vny'r  i~m  j- 


2  ^  w  (i,  j )  g 

E  E  ' J  ’n  -i  n  -j 

i=-m  j=-m  — | — ,  l+l 


The  sums  in  this  expression  are  understood  to  include  only 

n  -i  n  -j 

those  terms  for  which  -  and  — ^ —  are  integer  valued.  The 

r  i" 


equal  contribution  constraint  on  w  ensures  that  the  sum  will 

2  2 

have  a  total  weight  of  1/r  ,  hence  the  r  factor  normalizes  the 
sum  in  the  above  definition.  An  example  using  this  procedure 
is  shown  in  Figure  15.  Here  a  ratio  02/a ^=2. 5  is  obtained  with 
a  Type  1  HDC,  order  r=2,  width  k=2m+l=5,  and  with  a  separable 
generating  kernel  for  which  a=.4. 

Bandlimited  Derivatives,  Dl  and  D2 

To  obtain  band -limited  operators  for  first  and  second  deri¬ 
vatives  in  the  x  direction  we  simply  compute  the  difference 

l 

of  g^(x)  samples  which  are  separated  by  the  distance  r  in  the 
x  direction.  Using  the  reduced  form 


i>  j  »£ 


D1 


fi,j  ,1  +  gi,j-l,£  ~gi-l,j,£  “  gi-l,  j-1,  •£ 


and 


D2i»j  ,L  ~  j+k,£  ~1//2  (gi-l,  j+k,£  +  gi+l,j+k,£) 


Contour  maps  for  these  operators  are  shown  in  Figures  16  and 
17.  Again,  these  are  computed  from  a  type  1  HDC  with  r=2, 
k=2m+l=5,  and  a  separable  w,  with  a=.4. 


8.  Computational  Efficienc 


We  have  already  encountered  several  potential  advantages 
that  the  HOC  has  over  other  methods  of  computing  correlations 
with  Gaussian-like  kernels.  These  include  the  fact  that  cor¬ 
relations  are  simultaneously  obtained  for  a  number  of  kernels 
which  differ  in  size,  and  the  fact  that  only  a  small  generating 
kernel  need  be  specified  (or  stored  in  computer  memory)  even  to 
compute  correlations  with  very  large  equivalent  kernels.  In 
addition  the  structure  of  the  HDC  provides  a  convenient  mecha¬ 
nism  for  adjusting  the  sample  interval  to  the  scale  of  the  equi¬ 
valent  kernel.  We  will  now  show  that  the  HDC  is  also  computa¬ 
tionally  more  efficient  than  either  direct  correlation  or  corre¬ 
lations  computed  in  the  frequency  domain  using  the  Fast  Fourier 
transform  (FFT) . 

The  computational  expense  of  the  HDC  relative  to  other 
methods  depends  in  part  on  whether  only  the  correlation  at  one 
level  is  to  be  used,  or  all  correlations  up  to  and  including 
that  level  are  to  be  used,  and  on  whether  one  generates  the 
full  or  reduced  form  of  the  HDC.  We  consider  first  the  most 
costly  case,  in  which  a  full  HDC  is  computed  to  level  l,  and 
only  the  correlation  at  level  l  is  of  value.  We  wish  to  deter¬ 
mine  N  ,  the  number  of  elementary  operations  (additions, 

multiplications)  per  sample  of  g^(x).  Suppose  the  generating 

2 

kernel  has  width  k,  and  the  order  is  r.  There  will  be  k 


elementary  operations  from  level  1  to  Z  for  each  sample.  Thus 


the  number  of  operations 


This  number  should  be  compared  to  NCQR, 
per  sample  using  a  standard  correlation  with  the  equivalent 
kernel  h^.  The  width  of  h^  is  k^  =  2M^+1,  where  is  given 
in  Eq.  4.  Then 


(Here  we  assume  rs2.)  The  relative  efficiency  of  the  hierar¬ 
chical  to  standard  correlations  is  given  by  the  ratio 


N 

N 


HDC 

COR 


Typical  values  of  r  and  l  are  2  and  6  respectively.  In  this 
instance  the  standard  computation  requires  roughly  650  times 
as  many  elementary  operations  as  the  HDC. 

Next  we  compare  NHDC  to  NRFT,  the  per  location  cost  of 
computing  a  single  level  l  correlation  using  the  FFT .  A  one¬ 
dimensional  FFT  requires  N  log2N  elementary  operations,  where 
N  is  the  number  of  sample  elements  over  which  the  transform  is 

computed.  The  standard  method  for  computing  a  two-dimensional 
2 

FFT  requires  N  one-dimensional  transforms,  or  one  per  element 
of  the  transformed  array: 

^FFT  ^  log  2^ 

Notice  that  the  number  of  computations  per  location  depends  on 
the  array  dimensions  in  the  FFT  but  not  in  the  HDC.  The  rela¬ 
tive  cost  is  given  by 


NHCD  l  k2 

^FFT  ^  log2N 

As  an  example  assume  k=5,  1=6,  and  N=512.  Then  the  FFT  will 
require  more  elementary  operations  by  a  factor  of  about  40. 

Further  reduction  in  the  cost  of  the  HDC  relative  to  the 
other  methods  is  realized  when  one  wishes  to  obtain  correlations 
for  kernels  at  a  number  of  scales.  No  additional  computational 
steps  are  required  for  the  HDC,  while  a  separate  computation  is 
required  for  each  scale  with  the  other  methods. 

The  cost  of  the  correlation  can  be  cut  by  an  additional 
factor  of  almost  l  when  the  reduced  form  of  the  HDC  is  used. 

As  we  have  seen,  essentially  no  information  is  lost  by  using 
the  reduced  form. 


We  have  shown  that  correlation  of  a  function  f(x,y)  with 
certain  kernels  can  be  computed  hierarchically:  the  correlation 
with  a  large  kernel,  h£(x,y),  is  a  weighted  sum  of  correlations 
with  a  smaller  kernel  h^_^(x,y),  which  in  turn  are  weighted  sums 
of  correlations  with  still  smaller  kernels.  The  kernels  in  this 
sequence  differ  in  scale  by  factors  of  r  but  not  in  shape,  so 
that  h^_-^(x,y)  =  r  h^(x/r,y/r).  The  parameter  r  indicates  the 
order  of  the  hierarchy:  variations  on  the  hierarchical  discrete 
correlation  (HDC)  have  been  defined  for  fractional  r  between 
1  and  2,  and  for  integer  r  of  1  or  greater. 

The  members  of  the  class  of  kernels  which  can  be  computed 
hierarchically  closely  resemble  the  Gaussian  probability  distri¬ 
bution.  This  means  that  correlation  is  equivalent  to  low  pass 
filtering.  Also  the  two-dimensional  kernel  is  very  nearly 
circularly  symmetric. 

The  principal  advantage  of  the  HDC  method  is  that  it  is 
computationally  more  efficient  than  the  direct  correlation  and 
FFT  methods.  In  addition,  correlations  for  a  set  of  scaled 
kernels  are  computed  at  once,  without  any  need  to  construct  and 
store  large  kernels  or  kernels  of  different  shapes  and  sizes. 

These  properties  make  the  HDC  particularly  well  suited  for 
computer  image  analysis.  The  correlations  of  an  image  with 
sets  of  operators  which  differ  in  size  and  shape  can  be  obtained 
from  a  single  HDC.  These  include  the  bandpass  Laplacian 


and  oriented  first  and  second  derivative  operators. 

Several  other  correlation  methods  which  have  been  developed 
for  image  processing  may  now  be  regarded  as  special  cases  of 
the  HDC.  Rosenfeld  and  Thurston  [1]  describe  an  elegant  method 
for  computing  averages  within  square  regions  which  differ  in 
scale  by  powers  of  2.  This  is  equivalent  to  the  type  2  HDC  with 
order  r=2  and  generating  kernel  width  k=2  (see  Fig.  7a) .  Pratt 
et  al.  [4]  also  obtain  convolutions  with  large  kernels  by  re¬ 
peated  convolution  with  small  kernels.  (Convolutions  and  corre¬ 
lations  are  equivalent  under  a  reflection  of  the  kernel.) 

This  method  is  the  same  as  an  HDC  in  which  the  order  r=l .  The 
result  is  not  hierarchical  in  the  sense  we  describe  here  since 
the  equivalent  kernels  do  not  increase  geometrically  in  scale 
from  level  to  level  (or  even  arithmetically)  ,  nor  do  the 
kernels  obtained  at  each  iteration  agree  in  shape. 

The  reduced  form  of  the  type  2  HDC  with  r=2  and  generating 
kernel  width  2  is  equivalent  to  the  pyramid  structures  proposed 
by  Tanimoto  and  Pavlidis  [5]  and  others.  The  cone  structure 
of  Hanson  and  Riseman  [6]  is  like  a  reduced  Type  1  HDC  in  which 
the  generating  kernel  width  is  3  or  5,  and  the  order  is  2. 

However,  no  constraints  are  imposed  on  the  generating  kernel  in 
the  cone  so  equivalent  kernels  are  not  Gaussian-like. 

Finally,  the  reduced  HDC,  or  Gaussian-weighted  pyramid,  defined 
here  overcomes  an  awkward  property  of  traditional  pyramids,  that 
the  information  represented  at  each  level  changes  with  image 
position.  In  the  Gaussian-weighted  pyramid  values  at  each 


level  represent  discrete  samples  of  the  continuous  correlation 
function.  We  have  shown  that  this  function  is  band-limited  and 
that  the  sample  interval  is  sufficiently  small  so  that  no  infor¬ 
mation  is  lost  in  the  sampling  process.  This  is  true  regardless 
of  image  position. 


Appendix 

Type  4  HDC  (hexagonal  sampling  and  rotation) 

There  is  considerably  greater  flexibility  in  the  structure 
of  the  HDC  in  two  dimensions  than  in  one.  Two  extensions  will 
be  described  here:  the  use  of  hexagonally  arranged  samples, 
and  the  possibility  of  rotating  the  axes  of  the  generating  ker¬ 
nel  from  level  to  level  of  the  HDC.  We  shall  only  consider 
cases  in  which  there  is  a  single  generating  kernel  (unlike  type 
3  HDC) ,  and  shall  only  illustrate  the  computational  structure 
graphically  by  showing  the  arrangement  of  sample  and  generating 
kernel  nodes  at  two  successive  levels.  The  same  constraints 
apply  as  with  previous  HDC:  generating  kernels  should  be  norma¬ 
lized,  symmetric,  monophasic,  and  provide  for  equal  contribution. 
The  permissible  orders  of  the  HDC  are  determined  uniquely 

by  the  sampling  patterns.  In  particular,  r  is  a  possible  order 
2 

only  if  r  is  an  integer  and  r  is  the  distance  between  some 

pair  of  points  in  the  sampling  grid  [7] .  Permissible  orders 

2 

for  the  hexagonal  grid  are  r  =  3,4,7,...  and  for  the  rectan- 

2  2 
gular  grid  r  =  2,4,5,....  Examples  with  r  =3  and  4  (hexa- 

2 

gonal)  and  r  =2  and  5  (rectangular)  are  shown  in  Figure  18. 

In  these  diagrams  small  dots  show  the  positions  of  sample  points 
at  level  l  while  open  circles  show  sample  points  at  level  £+1. 

The  generating  kernels  are  centered  at  level  £+1  nodes,  and  the 
boundary  of  the  smallest  such  kernel  is  shown  by  line  segments. 
Extension  to  larger  kernels  is  simply  a  matter  of  expanding 


these  bounds;  the  arrangement  of  samples  is  unchanged.  Notice 
that  in  all  cases  shown  except  Figure  18a,  the  grid  of  level 
l+l  nodes  is  rotated  with  respect  to  level  l.  Rotations  are 
unavoidable  with  these  orders. 
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FIGURE  CAPTIONS 


Figure  1.  Graph  representation  of  a  type  1  hierarchical  dis¬ 
crete  correlation.  Here  nodes  represent  sample  points  of  the 
correlation  function  g£(x).  The  horizontal  position  of  the 
nodes  indicates  spatial  location,  x,  while  the  vertical  position 
indicates  level,  l.  The  generating  kernel  is  shown  as  a 
pattern  of  arrows  between  Successive  levels:  sample  values  at 
level  £  are  weighted  by  a,  b,  c  and  summed  to  obtain  the  value 
of  a  single  sample  at  level  £+1.  The  same  weighting  pattern 
is  applied  at  each  x  position  to  compute  all  g£+i (x)  samples 
from  g£(x)  samples.  Note  that  the  distance  between  the  sample 
points  which  contribute  to  a  sum  increases  as  r^,  where  the  order, 
r,  has  value  2  in  this  example. 

Figure  2.  Equivalent  kernels,  h£(x),  for  type  1  HDC.  In  this 
example  the  generating  kernel  has  width  5,  order  r=2 ,  and  weight 
parameter  a=0.4.  In  general  the  width  of  the  equivalent  kernel 
increases  by  a  factor  of  r  from  level  to  level,  while  the  ampli¬ 
tude  decreases  by  the  same  factor.  We  have  compensated  for 
this  effect  in  the  graphs  by  contracting  the  x  scale  and  expanding 
the  y  scale  by  r^.  Note  that  h]_(x)  is  identical  to  the  generat¬ 
ing  kernel,  and  that  as  £  is  increased,  the  shape  of  hi (x) 
converges  rapidly  to  a  continuous  function  which  is  similar  to 
the  Gaussian  probability  distribution. 

Figure  3.  Characteristic  forms,  h^fx),  of  the  rescaled  equiva¬ 
lent  kernels  for  three  values  of  the  weight  parameter,  a. 

Examples  are  for  a  type  1  HDC  in  which  the  generating  kernel 
has  width  5  and  r*2. 

Figure  4.  Comparison  of  the  equivalent  kernel  (solid  curve) 
with  the  Gaussian  probability  distribution  (dashed  curve) .  Here 
a=0.4,  and  the  Gaussian  which  most  closely  approximates  the 
equivalent  kernel  has  standard  deviation,  a=0.56j~.  This  is  the 
most  Gaussian-like  equivalent  kernel  that  can  be  obtained  with 
a  type  1  HDC  when  the  generating  kernel  width  is  5  and  r=2  (see 
Fig.  5) . 

Figure  5.  Shape  characteristics  of  type  1  equivalent  kernels. 

In  general  the  equivalent  kernel  becomes  broader  as  the  weight 
parameter,  a,  is  decreased.  This  effect  is  shown  in  Fig.  5a 
as  an  increase  in  the  standard  deviation,  a,  of  the  Gaussian 
function  which  most  closely  approximates  the  equivalent  kernel 
(least  squared  error) .  The  squared  error  between  this  Gaussian 
and  the  equivalent  kernel  is  shown  in  Fig.  5b.  Note  that  the 
error  is  minimum,  and  the  equivalent  kernel  most  Gaussian-like, 
for  a»0.4.  (The  error  is  normalized  by  rescaling  the  equivalent 
kernel  so  that  the  best  fit  Gaussian  has  unit  standard  deviation.) 
Here  the  generating  kernel  has  width  5  and  r=2. 


Figure  6.  Graph  representation  of  a  type  2  HDC .  See  Fig.  5 
for  explanation  and  comparison  to  type  1  HDC. 

Figure  7.  Characteristic  forms  for  type  2  HDC.  Forms  are 
shown  for  three  values  of  the  weighting  parameter,  a,  when  the 
generating  kernel  has  width  4  and  the  order  r=2.  (Compare  with 
Fig.  3.) 

Figure  8.  Shape  characteristics  of  type  2  equivalent  kernels. 

The  standard  deviation  of  the  best  fit  Gaussian  is  shown  in 
(a)  while  its  squared  error  is  shown  in  (b) .  (Compare  with 
type  1,  Fig.  5.)  Here  the  generating  kernel  has  width  4  and  r=2 . 

Figure  9a.  Reduced  type  1  HDC.  The  reduced  form  of  the  HDC  is 
like  the  standard  form  (Fig.  1)  except  that  only  nodes  spaced  by 
r£  are  included  at  level  l.  Computations  at  these  points  are 
identical  in  both  forms . 

Figure  9b.  Reduced  type  2  HDC.  Compare  with  the  standard  form. 
Fig.  6. 

Figure  10.  Type  3  HDC  with  order  r=3/2 .  This  graph  illustrates 
the  construction  of  a  fractional  order  hierarchical  correlation. 
Here  there  are  2  nodes  at  level  i  for  every  3  at  level  l- 1, 
and  there  are  two  generating  kernels,  one  of  which  is  the  left- 
to-right  reflection  of  the  other. 

Figure  11.  Generating  and  equivalent  kernels  for  order  r=3/2, 
type  3  HDC.  The  generating  kernel  in  Fig.  11a  is  one  of  two 
required  for  this  HDC,  which  are  left-to-right  reflections  of 
aie  another  (see  Fig.  10) .  The  characteristic  form  of  the  equi¬ 
valent  kernel  is  shown  as  a  solid  curve  in  Fig.  lib  while  the 
best  fit  Gaussian  function  is  shown  as  a  dashed  curve.  In  this 
example  the  weighting  parameter  a=0.085. 

Figure  12.  Shape  characteristics  of  type  3  equivalent  kernels. 
The  standard  deviation  of  the  best  fit  Gaussian  is  shown  in 
Fig.  12a  while  its  squared  error  is  shown  in  Fig.  12b.  (Compare 
with  type  1,  Fig.  5,  and  type  2,  Fig.  8.)  Here  the  generating 
kernel  has  width  5  and  r=3/2. 

Figure  13.  Node  positions  in  two  dimensions.  Dots  represent 
level  i  nodes  while  open  circles  indicate  the  relative  position 
of  level  l+l  nodes. 

Figure  14a.  Contour  map  for  a  type  1  two-dimensional  equivalent 
kernel.  This  example  was  obtained  with  a  separable  generating 
kernel  of  width  5  and  equal  weighting  patterns  in  both  X  and  Y 
directions,  ax  =  a  *  0.4.  (The  magnitude  has  been  scaled  so 
that  its  maximum  is  1.) 


Figure  14b.  Contour  map  for  an  elongated  type  1  equivalent 
kernel.  As  in  Fig.  14a,  this  example  was  obtained  with  a 
separable  generating  kernel,  but  here  different  weighting  pat¬ 
terns  were  applied  in  the  X  and  Y  directions,  a*  =  0.45  and 
ay  =  0.25. 

Figure  15.  Band  limited  Laplacian  operator.  Laplacian-like 
operators  are  obtained  as  the  difference  of  Gaussian-like 
operators  computed  at  successive  levels  of  the  HDC .  This 
example  is  a  type  1  HDC  with  width  5  generating  kernel,  r=2 
and  a=0.4. 

Figure  16.  Lowpass  first  derivative  operator.  This  operator 
is  obtained  as  the  weighted  sum  of  four  neighboring  samples  in 
a  type  1  HDC,  as  explained  in  the  text.  The  upper  figure 
shows  a  contour  map  and  the  lower  figure  a  horizontal  cross- 
section  of  the  operator. 

Figure  17.  Band  limited  second  derivative  operator.  This 
operator  is  the  weighted  sum  of  nine  neighboring  samples  of  the 
type  1  HDC.  The  upper  figure  shows  a  contour  map  and  the  lower 
figure  a  horizontal  cross  section  of  the  operator. 

Figure  18a, b.  Examples  of  low  order  hexagonal  sample  patterns 
for  type  4  HDC.  In  each  case  closed  circles  represent  level  l 
sample  points,  while  open  circles  represent  level  t+ 1  sample 
points.  Lines  indicate  the  bounds  of  the  smallest  allowed 
generating  kernels.  One  such  kernel  is  shown  with  heavy  lines 
and  the  distribution  and  values  of  its  weights  are  indicated. 

Figure  18c, d.  Fractional  order  rectangular  sample  patterns 
for  type  4  HDC.  (Compare  to  Fig.  13.) 
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